Style-Code Method for Multi-Style Parametric Text-to-Speech Synthesis

Related resources

Prosodic Reading Style Simulation for Text-to-Speech Synthesis

The simulation of different reading styles (mainly by adapting prosodic parameters) can improve the naturalness of synthetic speech and support more intelligent human-machine interaction. The article investigates the reading styles News and Tale as examples. For comparison, all examined texts contained the same genre-neutral paragraphs, which were read without a specific style instructio...

Style-Specific Phrasing in Speech Synthesis

People pause between words and sentences when they speak. They pause to emphasize content, or to make an utterance more understandable, or just to take a breath. A speech synthesizer should also insert similar pauses to sound natural. The process of inserting prosodic breaks in an utterance is called Phrasing. Phrasing is a crucial step during speech synthesis because other models of prosody de...

Towards speaking style transplantation in speech synthesis

One of the biggest challenges in speech synthesis is the production of natural-sounding synthetic voices. This means that the resulting voice must not only be of high enough quality but must also capture the natural expressiveness imbued in human speech. This paper focuses on solving the expressiveness problem by proposing a set of different techniques that could be used for ...

Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis

In this work, we propose “global style tokens” (GSTs), a bank of embeddings that are jointly trained within Tacotron, a state-of-the-art end-to-end speech synthesis system. The embeddings are trained with no explicit labels, yet learn to model a large range of acoustic expressiveness. GSTs lead to a rich set of significant results. The soft interpretable “labels” they generate can be used to con...

Uncovering Latent Style Factors for Expressive Speech Synthesis

Prosodic modeling is a core problem in speech synthesis. The key challenge is producing desirable prosody from textual input containing only phonetic information. In this preliminary study, we introduce the concept of “style tokens” in Tacotron, a recently proposed end-to-end neural speech synthesis model. Using style tokens, we aim to extract independent prosodic styles from training data. We ...

Journal

Journal title: SPIIRAS Proceedings

Year: 2018

ISSN: 2078-9599,2078-9181

DOI: 10.15622/sp.60.8